Global image captioning method based on graph attention network
Jiahong SUI, Yingchi MAO, Huimin YU, Zicheng WANG, Ping PING
Journal of Computer Applications    2023, 43 (5): 1409-1415.   DOI: 10.11772/j.issn.1001-9081.2022040513

Existing image captioning methods focus only on the spatial location features of grids, with insufficient interaction among grid features and incomplete use of global image features. To generate higher-quality image captions, a global image captioning method based on the Graph ATtention network (GAT) was proposed. Firstly, a multi-layer Convolutional Neural Network (CNN) was used for visual encoding: the grid features and the whole-image features of the given image were extracted, and a grid feature interaction graph was built from them. Then, by using GAT, the feature extraction problem was transformed into a node classification problem over one global node and many local nodes, so that both global and local features could be fully exploited once the node features had been updated and optimized. Finally, a Transformer-based decoding module used the improved visual features to generate the image caption. Experimental results on the Microsoft COCO dataset demonstrated that the proposed method effectively captured the global and local features of the image, achieving 133.1% on the CIDEr (Consensus-based Image Description Evaluation) metric. The proposed method is therefore effective in improving the accuracy of image captioning, which in turn supports text-based processing of images such as classification, retrieval, and analysis.
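To make the graph construction concrete, the following is a minimal NumPy sketch of one way grid features and a global node could interact through a single graph attention layer. It is not the authors' implementation: the abstract does not specify the graph connectivity, the number of attention heads, or the layer sizes, so the 4-neighbour grid links, the single head, the mean-pooled stand-in for the whole-image feature, and the names `build_grid_graph` and `gat_layer` are all assumptions made for illustration.

```python
import numpy as np


def build_grid_graph(height, width):
    """Boolean adjacency (with self-loops) for height*width grid nodes
    plus one global node stored at the last index.
    Assumption: grid cells link to their 4-neighbours, and the global
    node links to every grid cell."""
    n = height * width
    adj = np.eye(n + 1, dtype=bool)
    for r in range(height):
        for c in range(width):
            i = r * width + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < height and cc < width:
                    j = rr * width + cc
                    adj[i, j] = adj[j, i] = True
    adj[n, :] = True  # global node sees every node
    adj[:, n] = True  # every node sees the global node
    return adj


def gat_layer(features, adj, weight, attn_vec, negative_slope=0.2):
    """Single-head graph attention layer.
    e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax over neighbours j."""
    z = features @ weight                  # (N, d_out) projected nodes
    d_out = z.shape[1]
    src = z @ attn_vec[:d_out]             # contribution of node i
    dst = z @ attn_vec[d_out:]             # contribution of neighbour j
    scores = src[:, None] + dst[None, :]   # (N, N) raw attention logits
    scores = np.where(scores > 0, scores, negative_slope * scores)
    scores = np.where(adj, scores, -np.inf)        # mask non-neighbours
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha


# Toy usage with random features (hypothetical sizes).
rng = np.random.default_rng(0)
H, W, d_in, d_out = 3, 3, 8, 4
grid_feats = rng.normal(size=(H * W, d_in))
# Stand-in for the whole-image feature: mean of the grid features.
global_feat = grid_feats.mean(axis=0, keepdims=True)
nodes = np.vstack([grid_feats, global_feat])

adj = build_grid_graph(H, W)
weight = rng.normal(size=(d_in, d_out))
attn_vec = rng.normal(size=(2 * d_out,))
out, alpha = gat_layer(nodes, adj, weight, attn_vec)
```

In this sketch the global node attends to every grid cell and every grid cell attends to the global node, so a single layer already mixes local and whole-image information. The updated rows of `out` would then be passed to a decoder, which is outside the scope of this example.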
